Site-specific radio frequency (RF) propagation prediction increasingly depends on models built from visual data such as camera and LIDAR sensors. When operating in dynamic settings, the environment may only be partially observed. This paper introduces a method to extract statistical channel models given partial observations of the surrounding environment. We propose a simple heuristic algorithm that performs ray tracing on the partial environment and then uses machine-learning-trained predictors to estimate the channel and its uncertainty from features extracted from the partial ray-tracing results. It is shown that the proposed method can interpolate between a fully statistical model when no partial information is available and a fully deterministic model when the environment is completely observed. The method can also capture the degree of uncertainty in the propagation prediction depending on how much of the region has been explored. The method is demonstrated on a robotic navigation application simulated on a set of indoor maps, with detailed models constructed using state-of-the-art navigation, simultaneous localization and mapping (SLAM), and computer vision methods.
The millimeter wave (mmWave) bands have attracted considerable attention for high-precision localization applications due to their ability to capture measurements with high angular and temporal resolution. This paper explores mmWave-based positioning for a target localization problem in which a fixed target broadcasts mmWave signals and a mobile robotic agent attempts to listen to the signals in order to locate and navigate to the target. A three-step procedure is proposed: First, the mobile agent uses tensor decomposition methods to detect the wireless paths and their angles. Second, a machine-learning-trained classifier is used to predict the link state, meaning whether the strongest path is line-of-sight (LOS) or non-LOS (NLOS). For the NLOS case, the link state predictor also determines whether the strongest path arrived via one or more reflections. Third, based on the link state, the agent either follows the estimated angles or explores the environment. The method is demonstrated on a large dataset of indoor environments supplemented with ray tracing to simulate the wireless propagation. The path estimation and link state classification are also integrated into a state-of-the-art neural simultaneous localization and mapping (SLAM) module to augment camera- and LIDAR-based navigation. It is shown that the link state classifier can successfully generalize to completely new environments outside the training set. In addition, the neural-SLAM module with wireless path estimation and link state classification provides rapid navigation to the target, close to a baseline that knows the target location.
We study non-parametric estimation of the value function of an infinite-horizon $\gamma$-discounted Markov reward process (MRP) using observations from a single trajectory. We provide non-asymptotic guarantees for a general family of kernel-based multi-step temporal difference (TD) estimates, including canonical $K$-step look-ahead TD for $K = 1, 2, \ldots$ and the TD$(\lambda)$ family for $\lambda \in [0,1)$ as special cases. Our bounds capture the dependence of the estimation error on Bellman fluctuations, the mixing time of the Markov chain, any mis-specification in the model, as well as the choice of weight function defining the estimator itself, and reveal some delicate interactions between mixing time and model mis-specification. For a given TD method applied to a well-specified model, its statistical error under trajectory data is similar to that of i.i.d. sample transition pairs, whereas under mis-specification, temporal dependence in data inflates the statistical error. However, any such deterioration can be mitigated by increased look-ahead. We complement our upper bounds by proving minimax lower bounds that establish optimality of TD-based methods with appropriately chosen look-ahead and weighting, and reveal some fundamental differences between value function estimation and ordinary non-parametric regression.
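The $K$-step look-ahead idea described above can be illustrated with a minimal tabular sketch: a target built from $K$ discounted rewards plus a bootstrapped value at step $K$. The tiny two-state MRP, step size, and all parameter values below are hypothetical, chosen only to make the update concrete; they are not from the paper.

```python
# Minimal sketch of K-step TD value estimation from a single trajectory.
# The 2-state Markov reward process and all constants are illustrative.
import random

random.seed(0)
GAMMA = 0.9   # discount factor
K = 3         # look-ahead depth
ALPHA = 0.05  # step size

# Hypothetical MRP: state -> [(next_state, probability)], and per-state rewards.
P = {0: [(0, 0.7), (1, 0.3)], 1: [(0, 0.4), (1, 0.6)]}
R = {0: 1.0, 1: 0.0}

def step(s):
    nxt, = random.choices([a for a, _ in P[s]], weights=[w for _, w in P[s]])
    return nxt

# Generate one long trajectory.
T = 20000
states = [0]
for _ in range(T):
    states.append(step(states[-1]))

# K-step TD updates: target = sum of K discounted rewards + bootstrapped tail.
V = [0.0, 0.0]
for t in range(T - K):
    s = states[t]
    target = sum(GAMMA ** i * R[states[t + i]] for i in range(K))
    target += GAMMA ** K * V[states[t + K]]
    V[s] += ALPHA * (target - V[s])

print([round(v, 2) for v in V])
```

Since state 0 earns reward 1 per visit and state 1 earns 0, the estimate for state 0 should settle above that for state 1, with both bounded by $1/(1-\gamma) = 10$.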
Image super-resolution is a common task on mobile and IoT devices, where one often needs to upscale and enhance low-resolution images and video frames. While numerous solutions have been proposed for this problem in the past, they are usually not compatible with low-power mobile NPUs having many computational and memory constraints. In this Mobile AI challenge, we address this problem and task the participants with designing an efficient quantized image super-resolution solution that can demonstrate real-time performance on mobile NPUs. The participants were provided with the DIV2K dataset and trained INT8 models to do high-quality 3X image upscaling. The runtime of all models was evaluated on the Synaptics VS680 Smart Home board with a dedicated edge NPU capable of accelerating quantized neural networks. All proposed solutions are fully compatible with the above NPU, demonstrating up to 60 FPS when reconstructing Full HD resolution images. A detailed description of all models developed in the challenge is provided in this paper.
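The INT8 models mentioned above rely on mapping floating-point weights and activations to 8-bit codes. The following is a minimal sketch of symmetric per-tensor INT8 quantization, a common scheme for NPU deployment; the sample weight values are made up for illustration and do not come from any challenge submission.

```python
# Minimal sketch of symmetric per-tensor INT8 quantization/dequantization.
# The sample weights are illustrative only.

def quantize_int8(values):
    """Map floats to int8 codes using a symmetric per-tensor scale."""
    scale = max(abs(v) for v in values) / 127.0
    q = [max(-128, min(127, round(v / scale))) for v in values]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float values from int8 codes."""
    return [c * scale for c in q]

weights = [0.8, -1.27, 0.05, 0.0, 1.0]
q, scale = quantize_int8(weights)
recovered = dequantize(q, scale)
max_err = max(abs(a - b) for a, b in zip(weights, recovered))
print(q, round(scale, 4), round(max_err, 4))
```

The rounding error of this scheme is bounded by half the scale, which is the trade-off that lets the NPU run the model entirely in 8-bit integer arithmetic.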
The role of mobile cameras increased dramatically over the past few years, leading to more and more research in automatic image quality enhancement and RAW photo processing. In this Mobile AI challenge, the target was to develop an efficient end-to-end AI-based image signal processing (ISP) pipeline replacing the standard mobile ISPs that can run on modern smartphone GPUs using TensorFlow Lite. The participants were provided with a large-scale Fujifilm UltraISP dataset consisting of thousands of paired photos captured with a normal mobile camera sensor and a professional 102MP medium-format Fujifilm GFX100 camera. The runtime of the resulting models was evaluated on the Snapdragon 8 Gen 1 GPU, which provides excellent acceleration results for the majority of common deep learning ops. The proposed solutions are compatible with all recent mobile GPUs, being able to process Full HD photos in 20 to 50 milliseconds while achieving high fidelity results. A detailed description of all models developed in this challenge is provided in this paper.
Accurate tooth volume segmentation is a prerequisite for computer-aided dental analysis. Deep learning-based tooth segmentation methods have achieved satisfying performance but require a large quantity of tooth data. Publicly available dental data is limited, which means existing methods cannot be reproduced, evaluated, and applied in clinical practice. In this paper, we establish a 3D dental CBCT dataset, CTooth+, with 22 fully annotated volumes and 146 unlabeled volumes. We further evaluate several state-of-the-art tooth volume segmentation strategies based on fully supervised learning, semi-supervised learning, and active learning, and define the performance principles. This work provides a new benchmark for the tooth volume segmentation task, and the experiments can serve as a baseline for future AI-based dental imaging research and clinical application development.
In this paper, we introduce an unsupervised cancer segmentation framework for histology images. The framework involves an effective contrastive learning scheme for extracting distinctive visual representations for segmentation. The encoder is a deep U-Net (DU-Net) structure that contains an extra fully convolutional layer compared with the normal U-Net. A contrastive learning scheme is developed to address the lack of training sets with high-quality annotations of tumor boundaries. A specific set of data augmentation techniques is employed to improve the discriminability of the color features learned by contrastive learning. Convolutional conditional random fields are used for smoothing and noise elimination. The experiments demonstrate competitive segmentation performance, better than some popular supervised networks.
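Contrastive schemes of the kind described above typically score an anchor representation against one positive and several negative views. The following is a minimal sketch of an InfoNCE-style loss on normalized embeddings; the toy embeddings and temperature are illustrative, not the paper's actual loss or values.

```python
# Minimal sketch of an InfoNCE-style contrastive loss on unit-normalized
# embeddings. All vectors and the temperature are toy values.
import math

def info_nce(anchor, positive, negatives, tau=0.1):
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))
    def norm(a):
        n = math.sqrt(dot(a, a))
        return [x / n for x in a]
    a = norm(anchor)
    # Cosine similarities scaled by temperature; positive comes first.
    sims = [dot(a, norm(v)) / tau for v in [positive] + negatives]
    m = max(sims)  # stabilize the log-sum-exp
    logsumexp = m + math.log(sum(math.exp(s - m) for s in sims))
    return logsumexp - sims[0]  # -log softmax probability of the positive

anchor = [1.0, 0.0]
positive = [0.9, 0.1]          # nearly aligned with the anchor
negatives = [[0.0, 1.0], [-1.0, 0.0]]  # orthogonal and opposite views
loss = info_nce(anchor, positive, negatives)
print(round(loss, 6))
```

Because the positive view is nearly aligned with the anchor while the negatives are not, the loss is close to zero; pulling the positive away from the anchor would increase it.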
3D tooth segmentation is a prerequisite for computer-aided dental diagnosis and treatment. However, segmenting all tooth regions manually is subjective and time-consuming. Recently, deep learning-based segmentation methods have produced convincing results and reduced the manual annotation effort, but they require a large quantity of ground truth for training. To our knowledge, there is little tooth data available for 3D segmentation research. In this paper, we establish a fully annotated cone beam computed tomography dataset with tooth gold standards. This dataset contains 22 volumes (7363 slices) with fine tooth labels annotated by experienced radiographic interpreters. To ensure a relative data sampling distribution, data variance is included in the teeth, covering missing teeth and dental restorations. Several state-of-the-art segmentation methods are evaluated on this dataset. Afterwards, we further summarize and apply a series of 3D attention-based UNet variants for segmenting tooth volumes. This work provides a new benchmark for the tooth volume segmentation task. Experimental evidence proves that attention modules in the 3D UNet structure boost responses in tooth areas and inhibit the influence of background and noise. The 3D UNet with an SKNet attention module achieves the best performance, with 88.04% Dice and 78.71% IoU, respectively. The attention-based UNet framework outperforms other state-of-the-art methods on the CTooth dataset. The codebase and dataset are released.
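The Dice and IoU figures quoted above are standard overlap metrics between a predicted mask and the ground truth. A minimal sketch on flattened binary masks (the toy masks below are made up):

```python
# Minimal sketch of the Dice and IoU overlap metrics on binary masks.
# The prediction/ground-truth masks are toy examples.

def dice_iou(pred, gt):
    inter = sum(p & g for p, g in zip(pred, gt))
    psum, gsum = sum(pred), sum(gt)
    dice = 2 * inter / (psum + gsum)
    iou = inter / (psum + gsum - inter)  # union = |P| + |G| - |P ∩ G|
    return dice, iou

pred = [1, 1, 1, 0, 0, 1, 0, 0]
gt   = [1, 1, 0, 0, 1, 1, 0, 0]
dice, iou = dice_iou(pred, gt)
print(round(dice, 3), round(iou, 3))
```

For a single mask the two metrics satisfy IoU = Dice / (2 - Dice); applying this to 88.04% Dice gives roughly 78.6% IoU, consistent with the reported 78.71% (the identity holds per volume, not exactly for averages over volumes).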
Precise and rapid classification of images in the B-scan ultrasound modality is crucial for diagnosing ocular diseases. Nevertheless, distinguishing various diseases in ultrasound still challenges experienced ophthalmologists. Thus, a novel contrastive disentangled network (CDNet) is developed in this work, aiming to tackle the fine-grained image classification (FGIC) challenges of ocular abnormalities in ultrasound images, including intraocular tumor (IOT), retinal detachment (RD), posterior scleral staphyloma (PSS), and vitreous hemorrhage (VH). Three essential components of CDNet are the weakly-supervised lesion localization module (WSLL), the contrastive multi-zoom (CMZ) strategy, and the hyperspherical contrastive disentangled loss (HCD-Loss), respectively. These components facilitate feature disentanglement for fine-grained recognition in both the input and output aspects. The proposed CDNet is validated on our ZJU Ocular Ultrasound Dataset (ZJUOUSD), consisting of 5213 samples. Furthermore, the generalization ability of CDNet is validated on two public and widely-used chest X-ray FGIC benchmarks. Quantitative and qualitative results demonstrate the efficacy of our proposed CDNet, which achieves state-of-the-art performance in the FGIC task. Code is available at: https://github.com/zeroonegame/cdnet-for-ous-fgic.
Skull stripping is a crucial prerequisite step in the analysis of brain magnetic resonance images (MRI). Although many excellent works or tools have been proposed, they suffer from low generalization capability. For instance, a model trained on a dataset with specific imaging parameters cannot be well applied to other datasets with different imaging parameters. Especially for lifespan datasets, a model trained on an adult dataset is not applicable to an infant dataset due to the large domain difference. To address this issue, numerous methods have been proposed, where domain adaptation based on feature alignment is the most common. Unfortunately, this approach has some inherent shortcomings: it must be retrained for each new domain and requires concurrent access to the input images of both domains. In this paper, we design a plug-and-play shape refinement (PSR) framework for multi-site and lifespan skull stripping. To deal with the domain shift between multi-site lifespan datasets, we take advantage of the brain shape prior, which is invariant to imaging parameters and ages. Experiments demonstrate that our framework can outperform the state-of-the-art methods on multi-site lifespan datasets.